Objectives: From the point of view of clinical data representation, this study attempted to identify obstacles in translating
clinical narrative guidelines into computer interpretable format and integrating the guidelines with data in Electronic Health Records
in China. Methods: Based on SAGE and K4CARE formalism, a Chinese clinical practice guideline for hypertension was modeled
in Protege by building an ontology that had three components: flowchart, node, and vMR. Meanwhile, data items required
in Electronic Health Records for patients with hypertension were reviewed and compared with those from the ontology so as
to identify conflicts and gaps between them. Results: A set of flowcharts was built. A flowchart comprises three kinds of node: State,
Decision, and Act. Each node has a set of attributes, including data input/output that exports data items, which were then specified
following ClinicalStatement of HL7 vMR. A total of 140 data items were extracted from the ontology. In modeling the guideline,
some narratives were found too inexplicit to formulate, and encoding data was quite difficult. Additionally, it was found in the
healthcare records that there were 8 data items left out, and 10 data items defined differently compared to the extracted data
items. Conclusions: The obstacles in modeling a clinical guideline and integrating with data in Electronic Health Records include
narrative ambiguity of the guideline, gaps and inconsistencies in representing some data items between the guideline and
the patients' records, and unavailability of a unified medical coding system. Therefore, collaboration among various participants
in developing guidelines and Electronic Health Record specifications is needed in China.
I. Introduction

To encourage evidence-based practice, clinical decision sup- 
port systems (CDSSs) that provide digitized clinical practice 
guidelines (CPGs) and critical pathways are being actively 
introduced into the medical field for the health professional's 
use in clinical settings [1,2]. CDSSs can also help people 
with chronic diseases manage their own health by provid- 
ing instant knowledge and recommendations [3,4]. One of 
the key steps in developing a CPG-based CDSS is building 
computer interpretable guidelines (CIGs), also called guideline
ontologies, through guideline representation languages,
which define the declarative knowledge of complex medical 
pathways. Internationally, there are several languages and 
models to represent guidelines, such as GLIF3, SAGE, PROforma,
etc. [5,6]. No matter what kind of technique a CDSS
uses, it delivers alerts, reminders, and suggestions based on
not only decision rules specified in CPGs but also relevant
patient data describing the patient's health status collected by
Electronic Health Record (EHR) systems [7]. Therefore, a
large amount of routine clinical data needed by a CDSS has 
to be represented in an interoperable way with unambiguous 
meanings, and it must be consistent with those in CIGs so as 
to be retrieved and used by CDSS. To represent clinical data 
in a standard way for CDSSs, HL7 developed a data model 
called virtual medical record (vMR) based on the HL7 Ref- 
erence Information Model (RIM) [8]. vMR represents clini- 
cal information inputs and outputs that can be exchanged 
between CDS engines and clinical information systems 
through mechanisms such as CDS services. The Domain 
Analysis Model (DAM) of vMR includes structural specifica- 
tions for inputs and outputs of CDSS engines, which consists 
of several information classes, such as observation, encoun- 
ter, problem, adverse reaction, goal, procedure, medication 
order, etc., each representing a specialization of the RIM Act 
class. vMR is employed by SAGE to support information
communication between CDSSs and local clinical information
systems [9].

Hypertension is a significant risk factor for people's health.
It is estimated that there are currently 200 million hyperten- 
sion patients in China [10,11]. To cope with the prevalence 
and the characteristic dangers of hypertension, the Chinese 
government has established a strategy of "focusing on prevention
and transferring the healthcare downwards to primary
health sectors." Consequently, hypertension has been one
of the major chronic diseases managed by primary healthcare
organizations at the community level, and a clinical practice 
guideline specifically for primary healthcare of hyperten- 
sion was developed in 2009, in which general knowledge 
about the disease was documented comprehensively to guide 
general practitioners to monitor, evaluate, medicate, and 
instruct hypertension patients [10]. Meanwhile, the national 
government imposed a set of requirements for the delivery 
of public health services and required that the data needed 
for chronic disease management, including hypertension, 
must be recorded by EHR systems in primary healthcare organizations
for the purposes of evidence-based, continuing
patient healthcare and public health policy making [12]. The
data items and their formats in EHR systems that already
exist or are implemented in primary healthcare facilities are
also stated in the national specification.

Like many other clinical guidelines, the hypertension 
guideline has not been fully followed in daily healthcare
practice. Several factors restricting physicians' adherence to
clinical guidelines have been identified [13]. Investigations 
in other countries have revealed that data standardization 
is a critical factor for CDSS development [14-17]. In China, 
there are very few computerized guidelines to date in rou- 
tine clinical use. Furthermore, there have not been investi- 
gations yet on whether the existing narrative guidelines in 
Chinese are able to be computerized or whether the data 
that information systems collected following the announce- 
ment of national specifications can meet the needs of CDSS 
implementation. 

Therefore, this research tried to identify underlying obstacles
in implementing clinical guidelines from the aspect
of clinical data representation in the context of hypertension
by investigating the medical statements in both the guideline
and electronic patient record. Finally, we suggest some considerations
that need to be taken into account in the development
of clinical guidelines and EHR systems in China.

II. Case Description 

1. Steps of Investigation

We modeled the hypertension guideline based on SAGE 
and K4CARE formalism [18], and took Protege 3.4.3 as the
modeling tool [19,20]. In representing related clinical data,
we followed HL7 vMR DAM. The process is summarized as 
follows. 

First, we extracted and formalized the narrative guideline as 
an ontology. The ontology has three successive components: 
flowchart, node, and vMR. There are three kinds of nodes in 
flowchart: State, Decision, and Act, which are related to and 
instantiated in node. The attributes of a node were set as label, 
description, data input/output, and vMR class. Data items 
were abstracted from the attributes of data input/output and 
organized into the relevant classes of vMR. Because data items
are instances of the classes they belong to, these data items 
were then defined by the attributes and their data types of the 
classes in vMR DAM [21]. The semantics of data items were 
further specified by reference to the guideline content. During
the process of guideline modeling and data standardization,
guideline deficiencies in clinical descriptions were revealed.
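
The three-component structure described above can be sketched in a few lines of Python. This is only an illustrative model, not the Protege ontology itself: the class names, the example nodes, and their labels are hypothetical, and only the node kinds (State, Decision, Act) and the node attributes (label, description, data input/output, vMR class) come from the text.

```python
from dataclasses import dataclass, field
from enum import Enum


class NodeKind(Enum):
    """The three kinds of flowchart node named in the text."""
    STATE = "State"
    DECISION = "Decision"
    ACT = "Act"


@dataclass
class Node:
    kind: NodeKind
    label: str
    description: str
    data_io: list[str] = field(default_factory=list)  # data input/output items
    vmr_class: str = ""                               # e.g. "Observation", "Problem"


@dataclass
class Flowchart:
    name: str
    nodes: list[Node] = field(default_factory=list)


def items_by_vmr_class(flowchart: Flowchart) -> dict[str, list[str]]:
    """Collect each node's data items under the node's vMR class, without duplicates."""
    grouped: dict[str, list[str]] = {}
    for node in flowchart.nodes:
        bucket = grouped.setdefault(node.vmr_class, [])
        for item in node.data_io:
            if item not in bucket:
                bucket.append(item)
    return grouped


# Hypothetical nodes, loosely inspired by the routine BP monitoring flowchart.
chart = Flowchart(
    name="Routine monitoring of BP in adults",
    nodes=[
        Node(NodeKind.ACT, "Measure BP", "Measure blood pressure",
             ["systolic blood pressure", "diastolic blood pressure"], "Observation"),
        Node(NodeKind.DECISION, "BP elevated?", "Compare BP with a threshold",
             ["systolic blood pressure", "diastolic blood pressure"], "Observation"),
        Node(NodeKind.STATE, "Suspected hypertension", "Patient flagged for evaluation",
             ["hypertension"], "Problem"),
    ],
)
grouped = items_by_vmr_class(chart)
```

Grouping by vMR class mirrors the step in which data items are abstracted from the data input/output attribute and organized into vMR classes.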

Second, we reviewed the data items and metadata defined 
in the national specification of patients' EHRs. By comparing 
them with those in the guideline ontology, conflicts and gaps 
were identified. 
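
The comparison step can be pictured as a pair of set-style checks. The sketch below assumes each source is reduced to a mapping from data item name to a definition string; the function name `compare_items` and the definition strings are hypothetical, while the two example items (daily intake of sodium, kinship of family members) are ones this study reports as a gap and a conflict, respectively.

```python
def compare_items(guideline: dict[str, str], ehr: dict[str, str]) -> tuple[list[str], list[str]]:
    """Return (missing, conflicting).

    missing:     guideline items that the EHR specification does not define at all
    conflicting: items defined in both sources but with different definitions
    """
    missing = sorted(name for name in guideline if name not in ehr)
    conflicting = sorted(
        name for name, definition in guideline.items()
        if name in ehr and ehr[name] != definition
    )
    return missing, conflicting


# Illustrative definitions only; the strings are placeholders, not the real metadata.
guideline_items = {
    "daily intake of sodium": "quantity per day",
    "kinship of family member": "1st, 2nd or 3rd generation",
    "systolic blood pressure": "quantity in mmHg",
}
ehr_items = {
    "kinship of family member": "father, mother, sibling, filial",
    "systolic blood pressure": "quantity in mmHg",
}

missing, conflicting = compare_items(guideline_items, ehr_items)
```

In practice the definitions would need to be compared on structured metadata (data type, value set, unit) rather than on raw strings; plain equality is used here only to keep the sketch short.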

Finally, based on the previous steps, obstacles in computer- 
izing guidelines were inferred and corresponding recom- 
mendations for the development of clinical guidelines and 
patient health record systems were made. 
2. Ontology of Hypertension Guideline

Based on clinical regulation, the hypertension guideline was 
disassembled into 6 parts: identification, risk stratification, 
classification of hypertension management, lifestyle intervention,
medication, and care goals. These 6 parts and corresponding
flowcharts are shown in Table 1. A high-level flowchart
and a total of 32 partitioned flowcharts, which are of various
granularities, were created based on the rules described in the
textual guideline. One of the flowcharts is shown in Figure 1,
representing routine monitoring of blood pressure for adults.

All the rules in the flowcharts and corresponding statements 
represented in each node of the flowcharts are conceptualized 
in node and vMR. Flowchart, node, and vMR comprise the 
guideline ontology regarding medical knowledge of hyperten- 
sion management. 

In structuring and conceptualizing the guideline, one prob- 
lem we encountered is inexplicit and imprecise narratives. 
Some statements, such as "lack of physical exercise," "long-term
alcohol abuse," "detect liver function when needed," and "increase
examination frequency when disease worsens," are qualitative
and general; thus, they are very difficult to represent
specifically in node of the ontology.

3. Data Items in the Guideline Ontology 

There are 140 data items extracted from node of the guide- 
line ontology, which are classified into 4 classes in the Clini- 
calStatement package of vMR DAM: 17 data items to Goal, 
45 to Observation, 13 to Problem, and 65 to SubstanceAdmin- 
istration. Some are shown in Table 2. 
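
As a quick consistency check on the counts above, the per-class figures can be summed and expressed as shares of the total. The dictionary keys are the vMR class names from the text; the variable names are arbitrary.

```python
# Data item counts per vMR ClinicalStatement class, as reported in the text.
counts = {
    "Goal": 17,
    "Observation": 45,
    "Problem": 13,
    "SubstanceAdministration": 65,
}

total = sum(counts.values())  # should equal the 140 items reported

# Share of each class as a percentage of all extracted data items.
shares = {name: round(100 * n / total, 1) for name, n in counts.items()}
```

The four counts add up to the reported 140, with SubstanceAdministration accounting for nearly half of the items.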

4. Data Definitions in the Guideline Ontology 

In defining all the data items, we also tried to localize the 
values of some attributes according to codes, identifiers, vo- 
cabularies, etc., provided in the target guideline. Figure 2 is a 
composite Protege interface of browsing and editing classes 
and their instances. Taking the goal of blood pressure for
general hypertension patients as an example, we defined the
data items by specifying 2 attributes: goalFocus and targetGoalValue.

Table 1. Content of hypertension guideline

| Clinical step | Flowchart |
|---|---|
| Identification | Evaluation of occasionally observed high BP; routine monitoring of BP in adults (see Figure 1); measurement of BP in first encounter; measurement of BP in risk population; household BP measurement |
| Stratification | Evaluation of hypertension; observation of risk factors (high cholesterol, obesity, familial cardiovascular disease occurred at earlier age); impairment of target organs; identification of comorbidity |
| Classification | Primary management; secondary management; intensive management |
| Lifestyle intervention | Salt intake; smoking; drinking; body weight; exercise; emotion control |
| Medication | Calcium antagonists, ACEI, ARB, diuretics, beta blockers, compound preparations |
| Care goals | General hypertension; aged hypertension; other special hypertension |

BP: blood pressure, ACEI: angiotensin-converting enzyme inhibitor, ARB: angiotensin receptor blocker.

Figure 1. Flowchart for routine monitoring of BP in adults. BP: blood pressure, SBP: systolic BP, DBP: diastolic BP.

There were two major challenges in standardizing the data 
elements in localizing vMR DAM. The first challenge is how 
to tailor or specify the general definitions of vMR to satisfy 
the need to exactly represent the specific rules and concepts 
in the guideline. For example, targetGoalValue is defined 
with ANY, which is the most general and flexible HL7 data 
type. To be more meaningful and computer processable,
however, it would be better to constrain ANY to a more specific
type such as PQ, QTY, BL, or IVL, which are capable of
conveying explicit semantics, but such a transformation is difficult
without both clinical and informatics knowledge. Secondly,
attributes whose values are bound in vMR to standard
terminology or coding systems, such
as SNOMED CT [22,23], cannot currently be coded in
China because this standard is unavailable there. In China
there is no alternative national medical terminology system.
A similar example is LOINC, which has been widely
used internationally [24], but it has not been widely recognized
and adopted in China. So far, there have been neither
mappings among various locally defined coding systems nor
nationally unified identifiers for medical observations.
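
To illustrate the first challenge, the sketch below narrows targetGoalValue from the open-ended ANY to an interval of physical quantities, so that a measurement can be checked against the goal by a program. The classes `PQ`, `IVL_PQ`, and `Goal` are simplified stand-ins loosely modeled on the HL7 data types named above, not an HL7 implementation; the 140 mmHg upper bound is an assumed illustrative value, not one taken from this article; and goalFocus is a plain label here because no coding system is assumed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PQ:
    """Physical quantity: a numeric value plus a unit (simplified stand-in for HL7 PQ)."""
    value: float
    unit: str


@dataclass(frozen=True)
class IVL_PQ:
    """Interval of physical quantities; a bound of None means unbounded on that side."""
    low: Optional[PQ] = None
    high: Optional[PQ] = None
    low_closed: bool = True
    high_closed: bool = True

    def contains(self, q: PQ) -> bool:
        # Refuse to compare quantities expressed in different units.
        for bound in (self.low, self.high):
            if bound is not None and bound.unit != q.unit:
                raise ValueError(f"unit mismatch: {bound.unit} vs {q.unit}")
        if self.low is not None:
            if q.value < self.low.value or (q.value == self.low.value and not self.low_closed):
                return False
        if self.high is not None:
            if q.value > self.high.value or (q.value == self.high.value and not self.high_closed):
                return False
        return True


@dataclass(frozen=True)
class Goal:
    goal_focus: str            # stands in for goalFocus (CD); a label, not a real code
    target_goal_value: IVL_PQ  # targetGoalValue narrowed from ANY to an interval of PQ


# Assumed example goal: systolic BP strictly below 140 mmHg (illustrative threshold).
sbp_goal = Goal(
    goal_focus="systolic blood pressure",
    target_goal_value=IVL_PQ(high=PQ(140, "mmHg"), high_closed=False),
)

on_target = sbp_goal.target_goal_value.contains(PQ(132, "mmHg"))
at_limit = sbp_goal.target_goal_value.contains(PQ(140, "mmHg"))
```

With ANY left unconstrained, a rule engine could not decide whether a reading meets the goal; once the type is an interval with a unit, the check becomes a simple comparison. The second challenge remains: goal_focus would still need a shared code to be interoperable.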

5. Data Items and Their Representations in Data Collection Specification

We reviewed the data items defined in the national specifica- 
tion of patients' EHRs of hypertension and analyzed the defi- 
nitions provided in support documents. By comparing them 
with the standardized data definitions mentioned above, we 
found that among the data items extracted from the guide- 
line, 8 items were left out in patients' records: age of family 
members when hypertension occurred, the cause of death of 
family members, measurement of urinary trace albumin and
urinary albumin/creatinine, daily intake of sodium, and target
for body weight control. Additionally, we found 10 data
items that have definitions that are different from those of
the guideline. Taking active health problems as an example,
the guideline lists the diseases as cerebrovascular disease,
heart disease, diabetes or impaired glucose tolerance, retinopathy,
kidney disease, and peripheral vascular disease,
while the patients' records categorize them as cerebrovascular,
heart, neural, eye, kidney, and vascular diseases, as well
as diseases of other systems. Another example is the kinship
of family members. The guideline describes consanguinity
as 1st, 2nd or 3rd generation, whereas patients' records include
the relationships of father, mother, sibling and filial, which
should all be grouped into 1st generation relationships.

Table 2. Data items and their definitions in Goal of ClinicalStatement, vMR

| Data item (instance of the class) | Attribute |
|---|---|
| Medical observations: measurement of systolic and diastolic pressure, waistline, body mass index, total cholesterol, fasting plasma glucose, glycosylated hemoglobin. Lifestyle and eating habits: daily intake of sodium; food components and daily intake; drinking preference and alcohol consumption; smoking; type, frequency and duration of physical exercise; psychological state | goalFocus: CD; goalPursuitEffectiveTime: IVL_TS [0..1]; goalAchievementTargetTime: IVL_TS [0..1]; targetGoalValue: ANY [0..1] |
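
The kinship mismatch described above (relationships in patients' records versus generations in the guideline) could be bridged with a small lookup. This is a minimal sketch: the function name `to_generation` is hypothetical, and the table covers only the four relationships the text assigns to the 1st generation; any other relationship is rejected rather than guessed.

```python
# Relationships recorded in patients' records, mapped to the guideline's
# generation-based consanguinity. Only the four relationships named in the
# text are covered; all are 1st generation.
GENERATION_BY_RELATION = {
    "father": 1,
    "mother": 1,
    "sibling": 1,
    "filial": 1,
}


def to_generation(relation: str) -> int:
    """Translate a recorded family relationship into a consanguinity generation."""
    key = relation.strip().lower()
    if key not in GENERATION_BY_RELATION:
        # Fail loudly instead of inventing a mapping the sources do not provide.
        raise KeyError(f"no generation mapping for relation: {relation!r}")
    return GENERATION_BY_RELATION[key]


recorded = ["Father", "sibling", " mother "]
generations = [to_generation(r) for r in recorded]
```

A mapping like this only works in one direction: a record that stores "1st generation" cannot be converted back to a specific relationship, which is one reason the two specifications need to be harmonized at the source.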

III. Discussion 

Real-time CDSSs that deliver clinical guidelines must be
seamlessly integrated with existing patient
information systems to enable the automatic provision of
advice at the time and place at which decisions are made
[25]. Focusing on hypertension, by modeling the knowledge
and rules, our research first developed a guideline ontology 
to formulate the guideline and define the data items it con- 
tains. Secondly, we reviewed the health record specification 
for hypertension patients and compared the data definitions 
with those in the ontology to investigate whether they are 
mutually compatible in scope and semantics. Our investiga- 
tion demonstrated that it is basically feasible to represent 
the guideline in a computer interpretable format. However, 
problems arose in defining semantics for some data items 
because of narrative ambiguity of the guideline, gaps and 
inconsistencies in representing some data items between the 
guideline and patients' records, and unavailability of unified
medical coding systems. These problems might be common 
and need to be dealt with when any clinical guidelines are to 
be computerized. Accordingly, the following major sugges- 
tions for the development of guidelines and data specifica- 
tions can be made. 

Participation of people in different disciplines and at vari- 
ous levels is vital. Firstly, the development of a guideline 
requires input from, not only sufficiently qualified medi- 
cal professionals who are able to make the right decisions 
in complex clinical situations, but also primary healthcare 
practitioners who need definitive directives when facing 
multiple choices. The participation of primary healthcare 
practitioners is even more important in the development of 
guidelines that target the management of long-term chronic 
diseases, such as hypertension, where general practitioners 
play leading roles. Therefore, the statements in a guideline 
should be more precise or knowledge intensive, making 
computerized CPGs and CDSSs more usable for people who 
really need them. Secondly, disease management specifica- 
tions, which in China are designed mainly by public health 
agencies and address data recording issues, also need to be
harmonious and consistent with the guidelines developed by 
clinical experts so as to facilitate guideline implementation. 
Obviously, this harmonization relies on effective commu- 
nication and collaboration among all participants. Thirdly, 
difficulties in translating narrative guidelines to a computer 
interpretable format and in normalizing data definitions result
partly from a lack of informaticians in the development
of guidelines and patient record specifications. By identifying
what kind of data standards or coding systems will be
required in computerizing a specific guideline and whether 
they are available and applicable, informatics experts can 
certainly be helpful in solving some problems that have been 
identified in this research. 

More attention should be paid to data standard develop- 
ment and adoption issues. Computable representation of 
clinical information requires many standards, especially
medical terminologies so as to name, identify, and code 
medical concepts consistently. Unfortunately, few such 
standards have been developed domestically in China, and 
international standards have not been available so far. There- 
fore, we suggest that the government, healthcare, and health 
information communities make great effort to identify 
strategies, mechanisms, and technical solutions for clini- 
cal data standardization. Otherwise, it will be impossible 
to implement the guideline by CDSSs and to integrate data 
across various health information systems to build patient-centered,
longitudinal EHR and clinical data repositories.

This study had several limitations. The guideline ontol- 
ogy built in this study is based on a guideline document 
modeled by our research team. It remains to be validated
by clinical professionals in terms of its meaningfulness and
accuracy. Additionally, this research investigated clinical
data standardization issues only by focusing on the example
of hypertension. It cannot be considered comprehensive,
and there may be other significant challenges that remain
unidentified. After all, integrating all the related data and
knowledge in a CDSS has proven to be very complicated,
while the variety of health information systems and clinical
guidelines makes it even more difficult. Thus, resolving the data
standardization issues discussed herein is far from sufficient to
execute a CDSS.